 valence and arousal


Saliency-guided Emotion Modeling: Predicting Viewer Reactions from Video Stimuli

Yaragoppa, Akhila, Siddharth

arXiv.org Artificial Intelligence

Understanding the emotional impact of videos is crucial for applications in content creation, advertising, and Human-Computer Interaction (HCI). Traditional affective computing methods rely on self-reported emotions, facial expression analysis, and biosensing data, yet they often overlook the role of visual saliency -- the naturally attention-grabbing regions within a video. In this study, we utilize deep learning to introduce a novel saliency-based approach to emotion prediction by extracting two key features: saliency area and number of salient regions. Using the HD2S saliency model and OpenFace facial action unit analysis, we examine the relationship between video saliency and viewer emotions. Our findings reveal three key insights: (1) Videos with multiple salient regions tend to elicit high-valence, low-arousal emotions, (2) Videos with a single dominant salient region are more likely to induce low-valence, high-arousal responses, and (3) Self-reported emotions often misalign with facial expression-based emotion detection, suggesting limitations in subjective reporting. By leveraging saliency-driven insights, this work provides a computationally efficient and interpretable alternative for emotion modeling, with implications for content creation, personalized media experiences, and affective computing research.
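The two features named in the abstract, saliency area and number of salient regions, can be illustrated with a minimal sketch. It assumes the saliency map has already been thresholded into a binary mask and that regions are 4-connected; the abstract specifies neither, and the function name `saliency_features` is hypothetical rather than part of HD2S or OpenFace.

```python
from collections import deque


def saliency_features(mask):
    """Return (salient area fraction, number of 4-connected salient regions).

    `mask` is a list of equal-length rows holding 0/1 values; producing it
    from a real saliency map (thresholding) is assumed to happen upstream.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    salient_pixels = 0
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            salient_pixels += 1
            if seen[r][c]:
                continue
            # Unvisited salient pixel: flood-fill the region it belongs to.
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return salient_pixels / (rows * cols), regions


# Toy frame with two separate salient blobs (illustrative values only).
frame = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
area, count = saliency_features(frame)
print(area, count)
```

Per-frame values like these could then be aggregated over a video before being related to valence and arousal; how the authors aggregate is not stated in the abstract.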


Don't Get Too Excited -- Eliciting Emotions in LLMs

Fazzi, Gino Franco, Hinge, Julie Skoven, Heinrich, Stefan, Burelli, Paolo

arXiv.org Artificial Intelligence

This paper investigates the challenges of affect control in large language models (LLMs), focusing on their ability to express appropriate emotional states during extended dialogues. We evaluated state-of-the-art open-weight LLMs to assess their affective expressive range in terms of arousal and valence. Our study employs a novel methodology combining LLM-based sentiment analysis with multiturn dialogue simulations between LLMs. We quantify the models' capacity to express a wide spectrum of emotions and how they fluctuate during interactions. Our findings reveal significant variations among LLMs in their ability to maintain consistent affect, with some models demonstrating more stable emotional trajectories than others. Furthermore, we identify key challenges in affect control, including difficulties in producing and maintaining extreme emotional states and limitations in adapting affect to changing conversational contexts. These findings have important implications for the development of more emotionally intelligent AI systems and highlight the need for improved affect modelling in LLMs.


Using large language models to estimate features of multi-word expressions: Concreteness, valence, arousal

Martínez, Gonzalo, Molero, Juan Diego, González, Sandra, Conde, Javier, Brysbaert, Marc, Reviriego, Pedro

arXiv.org Artificial Intelligence

This study investigates the potential of large language models (LLMs) to provide accurate estimates of concreteness, valence and arousal for multi-word expressions. Unlike previous artificial intelligence (AI) methods, LLMs can capture the nuanced meanings of multi-word expressions. We systematically evaluated ChatGPT-4o's ability to predict concreteness, valence and arousal. In Study 1, ChatGPT-4o showed strong correlations with human concreteness ratings (r = .8) for multi-word expressions. In Study 2, these findings were repeated for valence and arousal ratings of individual words, matching or outperforming previous AI models. Study 3 extended the valence and arousal analysis to multi-word expressions and showed promising results despite the lack of large-scale human benchmarks. These findings highlight the potential of LLMs for generating valuable psycholinguistic data related to multi-word expressions. To help researchers with stimulus selection, we provide datasets with AI norms of concreteness, valence and arousal for 126,397 English single words and 63,680 multi-word expressions.
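The agreement statistic reported above is a correlation coefficient r between model estimates and human ratings. A minimal sketch of the Pearson correlation follows, assuming that is the coefficient meant; the function name `pearson_r` and the toy ratings are illustrative and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of ratings."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


# Toy ratings on a 1-5 scale (made up for illustration).
human_ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
model_ratings = [1.5, 1.9, 3.2, 3.8, 4.9]
print(round(pearson_r(human_ratings, model_ratings), 3))
```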


Uncovering Political Bias in Emotion Inference Models: Implications for sentiment analysis in social science research

Plisiecki, Hubert, Lenartowicz, Paweł, Flakus, Maria, Pokropek, Artur

arXiv.org Artificial Intelligence

This paper investigates the presence of political bias in emotion inference models used for sentiment analysis (SA) in social science research. Machine learning models often reflect biases in their training data, impacting the validity of their outcomes. While previous research has highlighted gender and race biases, our study focuses on political bias - an underexplored yet pervasive issue that can skew the interpretation of text data across a wide array of studies. We conducted a bias audit on a Polish sentiment analysis model developed in our lab. By analyzing valence predictions for names and sentences involving Polish politicians, we uncovered systematic differences influenced by political affiliations. Our findings indicate that annotations by human raters propagate political biases into the model's predictions. To mitigate this, we pruned the training dataset of texts mentioning these politicians and observed a reduction in bias, though not its complete elimination. Given the significant implications of political bias in SA, our study emphasizes caution in employing these models for social science research. We recommend a critical examination of SA results and propose using lexicon-based systems as a more ideologically neutral alternative. This paper underscores the necessity for ongoing scrutiny and methodological adjustments to ensure the reliability and impartiality of the use of machine learning in academic and applied contexts.


Free Energy in a Circumplex Model of Emotion

Pattisapu, Candice, Verbelen, Tim, Pitliya, Riddhi J., Kiefer, Alex B., Albarracin, Mahault

arXiv.org Artificial Intelligence

Previous active inference accounts of emotion translate fluctuations in free energy to a sense of emotion, mainly focusing on valence. However, in affective science, emotions are often represented as multi-dimensional. In this paper, we propose to adopt a Circumplex Model of emotion by mapping emotions into a two-dimensional spectrum of valence and arousal. We show how one can derive a valence and arousal signal from an agent's expected free energy, relating arousal to the entropy of posterior beliefs and valence to utility less expected utility. Under this formulation, we simulate artificial agents engaged in a search task. We show that the manipulation of priors and object presence results in commonsense variability in emotional states.
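The mapping stated in the abstract can be sketched numerically: arousal as the entropy of posterior beliefs, and valence as utility minus expected utility. This is only a reading of those two phrases; the function names, the use of Shannon entropy in nats, and the example numbers are assumptions, and the paper's exact formulation in terms of expected free energy may differ.

```python
from math import log


def arousal(posterior):
    """Shannon entropy (in nats) of a discrete posterior belief distribution."""
    return -sum(p * log(p) for p in posterior if p > 0)


def valence(utility, expected_utility):
    """Utility actually obtained minus the utility the agent expected."""
    return utility - expected_utility


# An agent unsure which of four locations hides the object (high entropy)
# versus one that is nearly certain (low entropy).
uncertain = [0.25, 0.25, 0.25, 0.25]
confident = [0.97, 0.01, 0.01, 0.01]
print(arousal(uncertain), arousal(confident))

# Finding the object (utility 1.0) when only 0.25 was expected.
print(valence(1.0, 0.25))
```

Under this reading, a more uncertain agent has higher arousal, and outcomes better than expected yield positive valence.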


Self context-aware emotion perception on human-robot interaction

Lin, Zihan, Cruz, Francisco, Sandoval, Eduardo Benitez

arXiv.org Artificial Intelligence

Emotion recognition plays a crucial role in various domains of human-robot interaction. In long-term interactions with humans, robots need to respond continuously and accurately. However, mainstream emotion recognition methods mostly focus on short-term emotion recognition, disregarding the context in which emotions are perceived. Humans take contextual information into account, and different contexts can lead to completely different emotional expressions. In this paper, we introduce a self context-aware model (SCAM) that employs a two-dimensional emotion coordinate system for anchoring and re-labeling distinct emotions. It also incorporates a distinctive information retention structure and a contextual loss. This approach yields significant improvements across the audio, video, and multimodal settings. In the auditory modality, accuracy rises from 63.10% to 72.46%. In the visual modality, accuracy increases from 77.03% to 80.82%. In the multimodal setting, accuracy increases from 77.48% to 78.93%. In the future, we will validate the reliability and usability of SCAM on robots through psychology experiments.


Language-based Valence and Arousal Expressions between the United States and China: a Cross-Cultural Examination

Cho, Young-Min, Pang, Dandan, Thapa, Stuti, Sherman, Garrick, Ungar, Lyle, Tay, Louis, Guntuku, Sharath Chandra

arXiv.org Artificial Intelligence

Although affective expressions of individuals have been extensively studied using social media, research has primarily focused on the Western context. There are substantial differences among cultures that contribute to their affective expressions. This paper examines the differences between Twitter (X) in the United States and Sina Weibo posts in China on two primary dimensions of affect - valence and arousal. We study the difference in the functional relationship between arousal and valence (so-called V-shaped) among individuals in the US and China and explore the associated content differences. Furthermore, we correlate word usage and topics in both platforms to interpret their differences. We observe that for Twitter users, the variation in emotional intensity is less distinct between negative and positive emotions compared to Weibo users, and there is a sharper escalation in arousal corresponding with heightened emotions. From language features, we discover that affective expressions are associated with personal life and feelings on Twitter, while on Weibo such discussions are about socio-political topics in the society. These results suggest a West-East difference in the V-shaped relationship between valence and arousal of affective expressions on social media influenced by content differences. Our findings have implications for applications and theories related to cultural differences in affective expressions.


Tollywood Emotions: Annotation of Valence-Arousal in Telugu Song Lyrics

Shanker, R Guru Ravi, Gupta, B Manikanta, Koushik, BV, Alluri, Vinoo

arXiv.org Artificial Intelligence

Emotion recognition from a given music track has heavily relied on acoustic features, social tags, and metadata but has seldom focused on lyrics. There are no datasets of Indian language songs that contain both valence and arousal manual ratings of lyrics. We present a new manually annotated dataset of Telugu songs' lyrics collected from Spotify with valence and arousal annotated on a discrete scale. A fairly high inter-annotator agreement was observed for both valence and arousal. Subsequently, we create two music emotion recognition models by using two classification techniques to identify valence, arousal and respective emotion quadrant from lyrics. Support vector machine (SVM) with term frequency-inverse document frequency (TF-IDF) features and fine-tuning the pre-trained XLMRoBERTa (XLM-R) model were used for valence, arousal and quadrant classification tasks. Fine-tuned XLMRoBERTa performs better than the SVM, improving macro-averaged F1-scores from 54.69%, 67.61% and 34.13% to 77.90%, 80.71% and 58.33% for valence, arousal and quadrant classifications, respectively, on 10-fold cross-validation. In addition, we compare our lyrics annotations with Spotify's annotations of valence and energy (same as arousal), which are based on entire music tracks. The implications of our findings are discussed. Finally, we make the dataset publicly available with lyrics, annotations and Spotify IDs.
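The models above are compared by macro-averaged F1, which averages the per-class F1 scores with equal weight per class regardless of class frequency. A minimal standard-library sketch of that metric follows; the function name `macro_f1` and the toy labels are illustrative and do not come from the dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero here
        # because each label occurs at least once in y_true or y_pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Toy valence labels for four lyrics (made up for illustration).
gold = ["positive", "positive", "negative", "negative"]
predicted = ["positive", "negative", "negative", "negative"]
print(macro_f1(gold, predicted))
```

Because each class counts equally, a model that ignores a rare class is penalised more under macro-averaging than under plain accuracy, which matters for the four-way quadrant task.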


Multi-Modality in Music: Predicting Emotion in Music from High-Level Audio Features and Lyrics

Krols, Tibor, Nikolova, Yana, Oldenburg, Ninell

arXiv.org Artificial Intelligence

This paper aims to test whether a multimodal approach for music emotion recognition (MER) performs better than a unimodal one on high-level song features and lyrics. We use 11 song features retrieved from the Spotify API, combined with lyrics features including sentiment, TF-IDF and Anew, to predict valence and arousal (Russell, 1980) scores on the Deezer Mood Detection Dataset (DMDD) (Delbouys et al., 2018) with 4 different regression models.

[...] API that makes a wide range of features accessible and therefore open to the public. But which features can actually predict the emotion of a song, and how well do Spotify's annotations perform? Building on existing literature presented in Section 2, we hypothesize that a multimodal approach combining high-level auditory and lyrics-extracted features performs better than a uni-modal one (Y.-H. Yang, Lin, Cheng, et al., 2008; Hu & Downie, 2010b, 2010a). We introduce our MER model in Section 3 before presenting and discussing the results of our exploratory and regression experiments in Sections 4 and 5.


Affective Idiosyncratic Responses to Music

CH-Wang, Sky, Li, Evan, Li, Oliver, Muresan, Smaranda, Yu, Zhou

arXiv.org Artificial Intelligence

Affective responses to music are highly personal. Despite consensus that idiosyncratic factors play a key role in regulating how listeners emotionally respond to music, precisely measuring the marginal effects of these variables has proved challenging. To address this gap, we develop computational methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform. Building on studies from music psychology in systematic and quasi-causal analyses, we test for musical, lyrical, contextual, demographic, and mental health effects that drive listener affective responses. Finally, motivated by the social phenomenon known as wǎng-yì-yún, we identify influencing factors of platform user self-disclosures, the social support they receive, and notable differences in discloser user activity.